PARSIMONY: An Infrastructure for Parallel Multidimensional Analysis and Data Mining

Authors

  • Sanjay Goil
  • Alok N. Choudhary
Abstract

Multidimensional analysis and online analytical processing (OLAP) operations require summary information on multidimensional data sets. The most common are aggregate operations along one or more dimensions of numerical data values. Simultaneous calculation of multidimensional aggregates is provided by the Data Cube operator, which is used to calculate and store summary information on a number of dimensions. The cube is computed only partially when the number of dimensions is large. Query processing for these applications requires different views of the data to gain insight and to support effective decisions. Queries may either be answered from a materialized cube in the data cube or calculated on the fly. The multidimensionality of the underlying problem can be represented in both relational and multidimensional databases, the latter being a better fit when query performance is the criterion for judgment. Relational databases scale well in size for OLAP and multidimensional analysis, and efforts are under way to make their performance acceptable. Multidimensional databases, on the other hand, have proven to provide good performance for such queries, although they are not very scalable. In this article we address (1) scalability in multidimensional systems for OLAP and multidimensional analysis and (2) integration of data mining with the OLAP framework. We describe our...

doi:10.1006/jpdc.2000.1691, available online at http://www.idealibrary.com
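
As an illustration of the Data Cube operator mentioned in the abstract, the following is a minimal, sequential sketch that computes the sum aggregate for every subset of the dimensions of a small fact table. It is not the parallel method described in the paper; the function name `data_cube`, the sample `sales` rows, and the column names are all hypothetical and chosen only for this example.

```python
from collections import defaultdict
from itertools import combinations


def data_cube(rows, dims, measure):
    """Naive full data cube: one sum aggregate per subset of `dims`.

    Returns a dict mapping each group-by (a tuple of dimension names)
    to a dict from key tuples to the summed measure.
    """
    cube = {}
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                # The key is the row's values on the grouped dimensions.
                key = tuple(row[d] for d in group)
                totals[key] += row[measure]
            cube[group] = dict(totals)
    return cube


# Hypothetical fact table with three dimensions and one measure.
sales = [
    {"product": "pen", "region": "east", "year": 2000, "units": 5},
    {"product": "pen", "region": "west", "year": 2000, "units": 3},
    {"product": "ink", "region": "east", "year": 2001, "units": 2},
]

cube = data_cube(sales, ("product", "region", "year"), "units")
print(len(cube))                      # number of group-bys computed
print(cube[("product",)])             # totals per product
print(cube[()])                       # grand total under the empty key
```

With d dimensions this naive version materializes 2^d group-bys and rescans the fact table for each one, which suggests why a full cube becomes impractical as the number of dimensions grows and why only part of it is computed in that case.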

Similar resources

A Parallel Scalable Infrastructure for OLAP and Data Mining

Decision support systems are important in leveraging information present in data warehouses in businesses like banking, insurance, retail and health-care among many others. The multi-dimensional aspects of a business can be naturally expressed using a multi-dimensional data model. Data analysis and data mining on these warehouses pose new challenges for traditional database systems. OLAP and da...

An Infrastructure for Scalable Parallel Multidimensional Analysis

Multidimensional Analysis in On-Line Analytical Processing (OLAP), and Scientific and statistical databases (SSDB) use operations requiring summary information on multi-dimensional data sets. Most common are aggregate operations along one or more dimensions of numerical data values and/or on hierarchies defined on them. Simultaneous calculation of multi-dimensional aggregates are provided by th...

Session / Séance 34-B Development of Interactive Tools for the Exploration of Large Geographic Databases

The methodology of parallel coordinate geometry is presented and adapted for the visualization of multidimensional geographic information. Large, multidimensional spatiotemporal data sets common in the environmental sciences require exploratory methods of analysis in addition to more deductive approaches. Representation techniques for such an inductive approach must facilitate visualization of ...

Large-Scale Multidimensional Data Visualization: A Web Service for Data Mining

In this paper, we present a Web application (offered as a service) for data mining, oriented toward multidimensional data visualization. The emphasis is on visualization methods as a tool for the visual presentation of large-scale multidimensional data sets. The proposed implementation includes five visualization methods: MDS SMACOF algorithm, Relative MDS, Diagonal majorization algor...

Grid-based Distributed Data Mining Systems, Algorithms and Services

Distribution of data and computation allows for solving larger problems and executing applications that are distributed in nature. The Grid is a distributed computing infrastructure that enables coordinated resource sharing within dynamic organizations consisting of individuals, institutions, and resources. The Grid extends the distributed and parallel computing paradigms allowing resource negoti...

Journal:
  • J. Parallel Distrib. Comput.

Volume 61

Publication date: 2001